survival tree
Soft decision trees for survival analysis
Consolo, Antonio, Amaldi, Edoardo, Carrizosa, Emilio
Decision trees are popular in survival analysis for their interpretability and ability to model complex relationships. Survival trees, which predict the timing of singular events using censored historical data, are typically built through heuristic approaches. Recently, there has been growing interest in globally optimized trees, where the overall tree is trained by minimizing the error function over all its parameters. We propose a new soft survival tree model (SST), with a soft splitting rule at each branch node, trained via a nonlinear optimization formulation amenable to decomposition. Since SSTs provide for every input vector a specific survival function associated to a single leaf node, they satisfy the conditional computation property and inherit the related benefits. SST and the training formulation combine flexibility with interpretability: any smooth survival function (parametric, semiparametric, or nonparametric) estimated through maximum likelihood can be used, and each leaf node of an SST yields a cluster of distinct survival functions which are associated to the data points routed to it. Numerical experiments on 15 well-known datasets show that SSTs, with parametric and spline-based semiparametric survival functions, trained using an adaptation of the node-based decomposition algorithm proposed by Consolo et al. (2024) for soft regression trees, outperform three benchmark survival trees in terms of four widely-used discrimination and calibration measures. SSTs can also be extended to consider group fairness.
PISA: An AI Pipeline for Interpretable-by-design Survival Analysis Providing Multiple Complexity-Accuracy Trade-off Models
Schlender, Thalea, Romme, Catharina J. A., van der Linden, Yvette M., van Lonkhuijzen, Luc R. C. W., Bosman, Peter A. N., Alderliesten, Tanja
Survival analysis is central to clinical research, informing patient prognoses, guiding treatment decisions, and optimising resource allocation. Accurate time-to-event predictions not only improve quality of life but also reveal risk factors that shape clinical practice. For these models to be relevant in healthcare, interpretability is critical: predictions must be traceable to patient-specific characteristics, and risk factors should be identifiable to generate actionable insights for both clinicians and researchers. Traditional survival models often fail to capture non-linear interactions, while modern deep learning approaches, though powerful, are limited by poor interpretability. We propose a Pipeline for Interpretable Survival Analysis (PISA) - a pipeline that provides multiple survival analysis models that trade off complexity and performance. Using multiple-feature, multi-objective feature engineering, PISA transforms patient characteristics and time-to-event data into multiple survival analysis models, providing valuable insights into the survival prediction task. Crucially, every model is converted into simple patient stratification flowcharts supported by Kaplan-Meier curves, whilst not compromising on performance. While PISA is model-agnostic, we illustrate its flexibility through applications of Cox regression and shallow survival trees, the latter avoiding proportional hazards assumptions. Applied to two clinical benchmark datasets, PISA produced interpretable survival models and intuitive stratification flowcharts whilst achieving state-of-the-art performances. Revisiting a prior departmental study further demonstrated its capacity to automate survival analysis workflows in real-world clinical research.
Enhanced Survival Trees
Zhou, Ruiwen, Xie, Ke, Liu, Lei, Xu, Zhichen, Ding, Jimin, Su, Xiaogang
We introduce a new survival tree method for censored failure time data that incorporates three key advancements over traditional approaches. First, we develop a more computationally efficient splitting procedure that effectively mitigates the end-cut preference problem, and we propose an intersected validation strategy to reduce the variable selection bias inherent in greedy searches. Second, we present a novel framework for determining tree structures through fused regularization. In combination with conventional pruning, this approach enables the merging of non-adjacent terminal nodes, producing more parsimonious and interpretable models. Third, we address inference by constructing valid confidence intervals for median survival times within the subgroups identified by the final tree. To achieve this, we apply bootstrap-based bias correction to standard errors. The proposed method is assessed through extensive simulation studies and illustrated with data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.
End-Cut Preference in Survival Trees
The end-cut preference (ECP) problem, referring to the tendency to favor split points near the boundaries of a feature's range, is a well-known issue in CART (Breiman et al., 1984). ECP may induce highly imbalanced and biased splits, obscure weak signals, and lead to tree structures that are both unstable and difficult to interpret. For survival trees, we show that ECP also arises when using greedy search to select the optimal cutoff point by maximizing the log-rank test statistic. To address this issue, we propose a smooth sigmoid surrogate (SSS) approach, in which the hard-threshold indicator function is replaced by a smooth sigmoid function. We further demonstrate, both theoretically and through numerical illustrations, that SSS provides an effective remedy for mitigating or avoiding ECP.
Node Splitting SVMs for Survival Trees Based on an L2-Regularized Dipole Splitting Criteria
Maung, Aye Aye, Lazar, Drew, Zheng, Qi
This paper proposes a novel, node-splitting support vector machine (SVM) for creating survival trees. This approach is capable of non-linearly partitioning survival data which includes continuous, right-censored outcomes. Our method improves on an existing non-parametric method, which uses at most oblique splits to induce survival regression trees. In the prior work, these oblique splits were created via a non-SVM approach, by minimizing a piece-wise linear objective, called a dipole splitting criterion, constructed from pairs of covariates and their associated survival information. We extend this method by enabling splits from a general class of non-linear surfaces. We achieve this by ridge regularizing the dipole-splitting criterion to enable application of kernel methods in a manner analogous to classical SVMs. The ridge regularization provides robustness and can be tuned. Using various kernels, we induce both linear and non-linear survival trees to compare their sizes and predictive powers on real and simulated data sets. We compare traditional univariate log-rank splits, oblique splits using the original dipole-splitting criterion and a variety of non-linear splits enabled by our method. In these tests, trees created by non-linear splits, using polynomial and Gaussian kernels show similar predictive power while often being of smaller sizes compared to trees created by univariate and oblique splits. This approach provides a novel and flexible array of survival trees that can be applied to diverse survival data sets.
SurvReLU: Inherently Interpretable Survival Analysis via Deep ReLU Networks
Sun, Xiaotong, Qiu, Peijie, Zhang, Shengfan
Survival analysis models time-to-event distributions with censorship. Recently, deep survival models using neural networks have dominated due to their representational power and state-of-the-art performance. However, their "black-box" nature hinders interpretability, which is crucial in real-world applications. In contrast, "white-box" tree-based survival models offer better interpretability but struggle to converge to global optima due to greedy expansion. In this paper, we bridge the gap between previous deep survival models and traditional tree-based survival models through deep rectified linear unit (ReLU) networks. We show that a deliberately constructed deep ReLU network (SurvReLU) can harness the interpretability of tree-based structures with the representational power of deep survival models. Empirical studies on both simulated and real survival benchmark datasets show the effectiveness of the proposed SurvReLU in terms of performance and interoperability. The code is available at \href{https://github.com/xs018/SurvReLU}{\color{magenta}{ https://github.com/xs018/SurvReLU}}.
Optimal Sparse Survival Trees
Zhang, Rui, Xin, Rui, Seltzer, Margo, Rudin, Cynthia
Interpretability is crucial for doctors, hospitals, pharmaceutical companies and biotechnology corporations to analyze and make decisions for high stakes problems that involve human health. Tree-based methods have been widely adopted for \textit{survival analysis} due to their appealing interpretablility and their ability to capture complex relationships. However, most existing methods to produce survival trees rely on heuristic (or greedy) algorithms, which risk producing sub-optimal models. We present a dynamic-programming-with-bounds approach that finds provably-optimal sparse survival tree models, frequently in only a few seconds.
Optimal Survival Trees: A Dynamic Programming Approach
Huisman, Tim, van der Linden, Jacobus G. M., Demiroviฤ, Emir
Survival analysis studies and predicts the time of death, or other singular unrepeated events, based on historical data, while the true time of death for some instances is unknown. Survival trees enable the discovery of complex nonlinear relations in a compact human comprehensible model, by recursively splitting the population and predicting a distinct survival distribution in each leaf node. We use dynamic programming to provide the first survival tree method with optimality guarantees, enabling the assessment of the optimality gap of heuristics. We improve the scalability of our method through a special algorithm for computing trees up to depth two. The experiments show that our method's run time even outperforms some heuristics for realistic cases while obtaining similar out-of-sample performance with the state-of-the-art.
NSOTree: Neural Survival Oblique Tree
Survival analysis is a statistical method employed to scrutinize the duration until a specific event of interest transpires, known as time-to-event information characterized by censorship. Recently, deep learning-based methods have dominated this field due to their representational capacity and state-of-the-art performance. However, the black-box nature of the deep neural network hinders its interpretability, which is desired in real-world survival applications but has been largely neglected by previous works. In contrast, conventional tree-based methods are advantageous with respect to interpretability, while consistently grappling with an inability to approximate the global optima due to greedy expansion. In this paper, we leverage the strengths of both neural networks and tree-based methods, capitalizing on their ability to approximate intricate functions while maintaining interpretability. To this end, we propose a Neural Survival Oblique Tree (NSOTree) for survival analysis. Specifically, the NSOTree was derived from the ReLU network and can be easily incorporated into existing survival models in a plug-and-play fashion. Evaluations on both simulated and real survival datasets demonstrated the effectiveness of the proposed method in terms of performance and interpretability.
Optimal survival trees ensemble
Gul, Naz, Faiz, Nosheen, Brawn, Dan, Kulakowski, Rafal, Khan, Zardad, Lausen, Berthold
Recent studies have adopted an approach of selecting accurate and diverse trees based on individual or collective performance within an ensemble for classification and regression problems. This work follows in the wake of these investigations and considers the possibility of growing a forest of optimal survival trees. Initially, a large set of survival trees are grown using the method of random survival forest. The grown trees are then ranked from smallest to highest value of their prediction error using out-of-bag observations for each respective survival tree. The top ranked survival trees are then assessed for their collective performance as an ensemble. This ensemble is initiated with the survival tree which stands first in rank, then further trees are tested one by one by adding them to the ensemble in order of rank. A survival tree is selected for the resultant ensemble if the performance improves after an assessment using independent training data. This ensemble is called an optimal trees ensemble (OSTE). The proposed method is assessed using 17 benchmark datasets and the results are compared with those of random survival forest, conditional inference forest, bagging and a non tree based method, the Cox proportional hazard model. In addition to improve predictive performance, the proposed method reduces the number of survival trees in the ensemble as compared to the other tree based methods. The method is implemented in an R package called "OSTE".